Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware, LAPACK Working Note #217
Authors
Abstract
The emergence and continuing use of multi-core architectures require changes in existing software, and sometimes even a redesign of established algorithms, in order to take advantage of the now prevailing parallelism. Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) is a project that aims to achieve both high performance and portability across a wide range of multi-core architectures. In this paper we present a comparative study of PLASMA's performance against established linear algebra packages (LAPACK and ScaLAPACK), against new approaches to parallel execution (Task-Based Linear Algebra Subroutines, TBLAS), and against equivalent commercial software offerings (MKL, ESSL and PESSL). Our experiments covered the one-sided linear algebra factorizations (LU, QR and Cholesky) and were run on multi-core architectures based on Intel Xeon EM64T and IBM POWER6 processors. The performance results show the improvements brought by the new algorithms on up to 32 cores, the largest multi-core system we could access.
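As a point of reference for the routines being compared, the sketch below drives the three one-sided factorizations (Cholesky, LU and QR) through the standard LAPACKE C interface shipped with LAPACK and MKL. The matrix size, the test matrix, and the driver structure are illustrative assumptions, not the paper's actual benchmark harness; PLASMA, ScaLAPACK, TBLAS, ESSL and PESSL expose their own calling sequences for these same operations.

```c
/* Minimal sketch: the three one-sided factorizations compared in the paper,
 * called through LAPACKE.  Build with e.g.  cc bench.c -llapacke -llapack -lblas
 * The size n and the synthetic test matrix are assumptions for illustration. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <lapacke.h>

int main(void)
{
    const lapack_int n  = 1000;                      /* assumed problem size */
    const size_t     nn = (size_t)n * (size_t)n;

    double     *a0   = malloc(nn * sizeof(double));  /* reference matrix */
    double     *a    = malloc(nn * sizeof(double));  /* working copy     */
    double     *tau  = malloc((size_t)n * sizeof(double));
    lapack_int *ipiv = malloc((size_t)n * sizeof(lapack_int));
    if (!a0 || !a || !tau || !ipiv) return EXIT_FAILURE;

    /* Diagonally dominant symmetric test matrix: positive definite, so the
       same input is a valid test case for Cholesky, LU, and QR. */
    for (lapack_int j = 0; j < n; ++j)
        for (lapack_int i = 0; i < n; ++i)
            a0[(size_t)j * n + i] = (i == j) ? (double)n : 1.0 / (1.0 + i + j);

    /* Cholesky: A = L * L^T, lower triangle overwritten with L. */
    memcpy(a, a0, nn * sizeof(double));
    lapack_int info = LAPACKE_dpotrf(LAPACK_COL_MAJOR, 'L', n, a, n);
    printf("dpotrf info = %d\n", (int)info);

    /* LU with partial pivoting: P * A = L * U, pivot indices in ipiv. */
    memcpy(a, a0, nn * sizeof(double));
    info = LAPACKE_dgetrf(LAPACK_COL_MAJOR, n, n, a, n, ipiv);
    printf("dgetrf info = %d\n", (int)info);

    /* QR: A = Q * R, Householder reflectors stored below R, scalars in tau. */
    memcpy(a, a0, nn * sizeof(double));
    info = LAPACKE_dgeqrf(LAPACK_COL_MAJOR, n, n, a, n, tau);
    printf("dgeqrf info = %d\n", (int)info);

    free(a0); free(a); free(tau); free(ipiv);
    return EXIT_SUCCESS;
}
```

In a driver like this, parallelism comes from linking a multithreaded BLAS underneath the LAPACK calls, whereas PLASMA parallelizes the factorizations themselves with tile algorithms and dynamic scheduling; that architectural difference is what the paper's comparison measures.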
Similar resources
Scheduling Two-sided Transformations using Algorithms-by-Tiles on Multicore Architectures LAPACK Working Note #214
The objective of this paper is to describe, in the context of multicore architectures, different scheduler implementations for the two-sided linear algebra transformations, in particular the Hessenberg and bidiagonal reductions, which are the first steps of the standard eigenvalue problem and the singular value decomposition, respectively. State-of-the-art dense linear algebra software, such ...
Batched matrix computations on hardware accelerators based on GPUs
Scientific applications require solvers that work on many small-size problems that are independent of each other. At the same time, high-end hardware evolves rapidly and becomes ever more throughput-oriented, so there is an increasing need for an effective approach to developing energy-efficient, high-performance codes for these small matrix problems, which we call batched factorizations....
Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited LAPACK Working Note #208
The objective of this paper is to extend and redesign the block matrix reduction applied to the family of two-sided factorizations, introduced by Dongarra et al. [9], for the context of multicore architectures using algorithms-by-tiles. In particular, the Block Hessenberg Reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenva...
OpenCL Evaluation for Numerical Linear Algebra Library Development
With the help of CUDA [7], [6], many applications have improved their performance by using GPUs. In our project, called Matrix Algebra on GPU and Multicore Architectures (MAGMA) [10], we mainly focus on dense linear algebra routines similar to those from LAPACK [1]. Other than CUDA, there exist other frameworks that allow platform-independent programming for GPUs. The main three frameworks are: 1) ...
A Comparative Study of Multi-Attribute Continuous Double Auction Mechanisms
Auctions have long been a competitive method of buying and selling valuable or rare items. Single-sided auctions, in which participants negotiate on a single attribute (e.g. price), are very popular. Double auctions and negotiation on multiple attributes offer more advantages than single-sided and single-attribute auctions. Nonetheless, this adds to the complexity of the auctio...
Journal title:
Volume, issue:
Pages: -
Publication date: 2009